Molecular and phylogenetic characterization of the monkeypox outbreak in the South of Spain

Abstract
Background and Aim: Until the May 2022 Monkeypox (MPXV) outbreak, which spread rapidly to many non-endemic countries, the virus was considered a viral zoonosis limited to some African countries. The Andalusian circuit of genomic surveillance was rapidly applied to characterize the MPXV outbreak in the South of Spain.
Methods: Whole genome sequencing was used to obtain the genomic profiles of samples collected across the south of Spain, representative of all the provinces of Andalusia. Phylogenetic analysis was used to study the relationship between the isolates and the available sequences of the 2022 outbreak.
Results: Whole genome sequences were obtained for a total of 160 MPXV viruses from the different provinces that reported cases. Interestingly, we report the sequences of MPXV viruses obtained from two patients who died. While one of the isolates bore no noteworthy mutations that could explain a potential heightened virulence, in the other patient the second consecutive genome sequence, performed after the administration of tecovirimat, uncovered a mutation within the A0A7H0DN30 gene, known to be a prime target for tecovirimat in its Vaccinia counterpart. In general, a low number of mutations was observed in the sequences reported, which were very similar to the reference of the 2022 outbreak (OX044336), as expected from a DNA virus. The samples likely correspond to several introductions of the circulating MPXV viruses from the last outbreak. The virus sequenced from one of the two patients who died presented a mutation in a gene that bears potential connections to drug resistance. This mutation was absent in the initial sequencing before treatment.


| INTRODUCTION
Monkeypox (MPXV) is a viral zoonosis endemic in some West and Central African countries, with few cases outside Africa. 1 In May 2022, an unexpectedly large MPXV clade B.1 outbreak affecting a considerable number of non-endemic countries was reported. 2,3 After the first autochthonous cases were reported in the United Kingdom on May 13 and in Spain on May 17, the virus spread rapidly: more than 14,000 cases were reported in more than 60 countries in the first 2 months of the outbreak alone, rising to more than 88,000 cases worldwide as of June 2023. 4 Although the incidence has drastically reduced in 2023, 4 the simultaneous MPXV incidence observed in different countries due to rapid cross-border transmission 5 poses a real threat that must be addressed by robust public health surveillance and control measures. 6 In fact, contrary to previous outbreaks, the current one is mainly transmitted through sexual contact, 7 with HIV-positive men overrepresented (about 50%) among MPXV-infected patients. 8 Other coinfections have also been described, such as SARS-CoV-2 and MPXV. 9 Thus, further research is required to delve deeper into the origins of the recent outbreak, investigating potential factors such as animal reservoirs, human behavior, or viral mutations that might be driving its occurrence. 6 In particular, genetic variation plays a crucial role in the transmission of MPXV, facilitating its adaptation to various hosts and fluctuating environmental conditions, as observed in samples obtained from the Democratic Republic of the Congo, which present genomic variations linked to the disease's transmissibility and severity. 10 Recent research indicates that the present strains of MPXV have evolved from the B.1 lineage, showing signs of adaptation to humans. 11,12
In this context, genomic monitoring of the MPXV samples sequenced through the epidemiologic surveillance in Andalusia is crucial from the epidemiological point of view, as it clearly assigns the Andalusian sequences to the circulating clade. Moreover, whole-genome virus sequencing also has a relevant role in monitoring polymorphisms, as well as in detecting gene losses caused by possible intragenic frameshifts or premature stop codons that could appear locally and might be relevant as determinants of virulence or enhanced transmissibility.

| The Andalusian outbreak
The phylogeny of the 2022 MPXV outbreak in the context of the rest of available MPXV sequences, as displayed in the Genomic Surveillance Circuit of Andalusia, 13,14 is depicted in Figure 1. The specific features of the sequences of this outbreak and the apparently fast evolution with respect to previous outbreaks have already been discussed. 15 As expected from a DNA virus, with a relatively low introduction into the general population, the samples isolated in Andalusia have only a few mutations with respect to the rest of MPXV isolates in the outbreak. Andalusian isolates are scattered across the outbreak branch in the phylogeny and are related to sequences from other countries, suggesting different introductions of the virus in Andalusia. Figure 2 portrays a detail of the phylogeny of the MPXV isolated in Andalusia (see Figure S1 for a more detailed picture).
In addition, it is worth analyzing in detail some nonsynonymous mutations that appear specifically in the Andalusian samples. Supporting Information S2: Table 2 shows the nonsynonymous mutations found in the samples under study (Supporting Information S2: Table 1) with respect to the reference ON563414 with the genomic coordinates of NC_063383. 16 The most common mutation occurs in 14 Andalusian MPXV isolates in the protein A0A7H0DNG7, which belongs to the Bcl-2-like protein family, whose members function as immunomodulators that evade the host innate immune response by inhibiting apoptosis or blocking the activation of proinflammatory transcription factors. 17 Another frequent mutation, shared by 13 Andalusian MPXV isolates, occurs in A0A7H0DN47, a transmembrane protein of unknown function. Two other proteins were each found mutated in 11 Andalusian MPXV isolates: A0A7H0DN82, an envelope protein which has been described as a late gene transcription factor VLTF-4, 18 and M1LBQ5, which is part of a large complex required for early virion morphogenesis. 19 Also, A0A7H0DN66, a component of the entry fusion complex, which consists of 11 proteins and mediates entry of the virion core into the host cytoplasm, and A0A7H0DNF5, a soluble interferon-gamma receptor-like protein, were found mutated in eight and six Andalusian MPXV isolates, respectively. Thus, some differences in the immune response or in viral replication could characterize currently circulating Andalusian isolates. There are also 96 more mutations, most of them private mutations of specific MPXV isolates and a few of them shared by at most five isolates (see Supporting Information S2: Table 2). Among them, it is worth mentioning those found in genes A0A7H0DNG4 (in one isolate) and A0A7H0DNG6 (in four isolates), which were previously identified as virulence genes B19R and B21R, respectively, by comparing isolates of two outbreaks in Nigeria with different mortality rates. 18
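The distinction between shared and private mutations used above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from isolate name to a list of substitution labels; the isolate names and labels below are placeholders, not data from this study, and this is not the tooling used by the authors.

```python
from collections import Counter


def tally_mutations(mutations_by_isolate):
    """Count in how many isolates each substitution label appears."""
    counts = Counter()
    for mutations in mutations_by_isolate.values():
        # set() ensures a mutation is counted at most once per isolate
        counts.update(set(mutations))
    return counts


def split_private_shared(counts):
    """Private mutations occur in exactly one isolate; shared ones in more."""
    private = sorted(m for m, n in counts.items() if n == 1)
    shared = sorted(m for m, n in counts.items() if n > 1)
    return private, shared


# Toy input: hypothetical isolates and labels, for illustration only
example = {
    "isolate_A": ["GENE1:A10V", "GENE2:R5H"],
    "isolate_B": ["GENE1:A10V"],
    "isolate_C": ["GENE3:E7K"],
}
counts = tally_mutations(example)
private, shared = split_private_shared(counts)
```

Sorting the resulting counts in descending order would give a ranking of the most frequently mutated positions, which is the kind of summary reported in Supporting Information S2: Table 2.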

| Structural variation spectrum of the Andalusian outbreak
Among the 160 sequenced samples analyzed, a total of 15 isolates displayed structural variations (see Figure 3). Of these, seven isolates exhibited distinct types of deletions, with sizes ranging from 912 to 6472 base pairs. Notably, the most prevalent deletion type, observed in four isolates, involved the partial deletion of the A0A7H0DMZ9 protein, specifically affecting the region spanning nucleotides 12,143−13,055 (see del1 label in Figure 2). This protein mediates the ubiquitination and subsequent proteasomal degradation of NF-kappa-B by targeting the NF-kappa-B RELA subunit to the SCF E3 ligase complex. Ubiquitination and proteasomal degradation are cellular mechanisms that are known to be targeted or modulated by some viruses to manipulate host cell signaling pathways, evade immune responses, or regulate viral protein stability. 20 The other eight isolates bear a genomic rearrangement in which one terminal part of the genome was deleted and replaced by an inverted duplication of the other end of the genome. Four of these isolates exhibit a deletion of the 3′ region and an inverted duplication of the 5′ region. This rearrangement causes the partial loss of the A0A7H0DNG6 protein, mentioned above as related to virulence.
All the structural variants seem to have appeared de novo at different points of the MPXV phylogeny, and only del1 and dup2 (see Figure 2) seem to have formed clusters of MPXV isolates sharing the specific variant. Andalusian isolate (ANDmpxv00019) as a private mutation (Figure 2).
To the best of our knowledge, no instances of mortality or severe complications have been documented in association with any of these isolates, thus substantiating the neutral nature of this mutation.
On the other hand, samples from the second patient, who was immunocompromised, were sequenced on two occasions.
During the initial sequencing, before the administration of tecovirimat antiviral treatment, two mutations were identified: OPG094:R194H and OPG205:E452K. The first mutation was observed in the A0A7H0DN66 gene and has also been found in several Andalusian isolates with no apparent pathological phenotype.
However, in the second sequencing, conducted after tecovirimat treatment, the OPG205:E452K mutation was no longer detected, while the OPG057:A290V mutation emerged.
This latter mutation was observed in the A0A7H0DN30 gene, which codes for OPG057 (Figure 4), an envelope protein involved in the biogenesis of the viral double membrane and in egress of virus from the host cell. It produces the wrapped form of virus that is required for cell-to-cell spread and acts as a lipase with broad specificity, including phospholipase C, phospholipase A, and triacylglycerol lipase activities. 21,22 Interestingly, the homolog of the A0A7H0DN30 gene in Vaccinia, which presents a high similarity of 99.46%, has been recognized as a target for tecovirimat. 23 Indeed, recent studies demonstrate that tecovirimat is also effective against MPXV, 24 with the OPG057 protein (homolog of VP37) being the target. 24,25 Drugs used for smallpox treatment, such as tecovirimat, brincidofovir, and cidofovir, were also used to treat MPXV. 26 These drugs target different proteins (for example, cidofovir targets the DNA polymerase 27) or have different mechanisms of action (for example, brincidofovir thwarts DNA polymerization). 28 In an effort to identify molecules capable of inhibiting viral entry or the replication of orthopoxviruses, certain drugs already used for tumor-related conditions, such as imatinib and mitoxantrone, as well as antibiotics like rifampicin, have also been suggested. 29 Drug repurposing, 30 computational approaches, 31 and artificial intelligence 32 have also been used to suggest new therapeutic options for MPXV infection.
On the other hand, no deletions, rearrangements, or any other kind of structural variation were found in any of the isolates from either deceased patient.

F I G U R E 3
Coverage plots representing the different structural variants found.

| CONCLUSIONS
The genomic changes detected in this study are important in assessing the microevolution of the circulating virus, although the functional impact of these mutations is still difficult to assess in the general context of virus circulation. In conclusion, the genomic surveillance platform currently running in Andalusia, created in response to the COVID-19 pandemic, has enabled an extremely rapid response to monitor the spread and evolution of MPXV in the region and to contribute data and knowledge to national and international genomic surveillance of MPXV epidemics. Specific genomic surveillance criteria must be established at the national and international levels to optimize resources and increase the usefulness of the results for the control of MPXV transmission.

| Samples
From the first appearance of MPXV in the South of Spain (Andalusia) on May 26, a few days after the first report in Spain, 33 until the end of December, a total of 160 complete MPXV genomes were sequenced in the Genomic Surveillance Circuit of Andalusia. 13,14 The virus samples were obtained evenly across all of Andalusia and originated from patients with clinical pictures suggestive of MPXV infection. The patients' clinical histories recorded the need to include this study in the differential diagnosis of the infectious process. The information on the patient from whom sequences ANDmpxv00218 and ANDmpxv00238 were isolated was taken from a previous work of the group. 34 Supporting Information S2: Table 1 lists the viral genomes sequenced.

| Sequencing method
Before DNA extraction from ulcerative lesion samples using the QIAamp DNA kit (Qiagen), sonication and DNAse/RNAse treatment were performed. 15 Subsequently, shotgun metagenomics was performed. In brief, DNA libraries were prepared using the Illumina DNA Prep kit (Illumina) and IDT for Illumina DNA/RNA UD Indexes sets (Illumina). The quality of the libraries was validated with a Qubit 4 fluorometer (Thermo Fisher Scientific). Sequencing was performed on NextSeq 550/1000 (Illumina).

| Data processing
Sequencing data were analyzed using in-house scripts and the nf-core/viralrecon pipeline software, 35 version 2.4.1. Briefly, after read quality filtering, sequences for each sample were aligned to the high-quality MPXV isolate OX044336.2, 36 related to the 2022 outbreak, using the bowtie2 algorithm. 37 Genomic variants were identified with the iVar software, 38 using a minimum allele frequency threshold of 0.25 for calling variants, followed by a filtering step that kept only variants with a minimum allele frequency of 0.75. Using the set of high-confidence variants and the OX044336.2 genome, a consensus genome per sample was finally built using bcftools. 39
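The two-threshold logic described above (0.25 to call a variant, 0.75 to keep it for the consensus) can be sketched as follows. This is an illustrative stand-in, not the iVar or viralrecon implementation; the function name, the labels it returns, and the depth-based definition of allele frequency are assumptions made for the example.

```python
CALL_MIN_AF = 0.25       # minimum allele frequency to call a variant (from the text)
CONSENSUS_MIN_AF = 0.75  # minimum allele frequency to keep a variant (from the text)


def classify_variant(alt_depth, total_depth):
    """Classify a site by the fraction of reads supporting the alternate allele.

    Hypothetical helper: allele frequency is taken here as alt_depth / total_depth.
    """
    if total_depth <= 0:
        return "no_coverage"
    allele_frequency = alt_depth / total_depth
    if allele_frequency < CALL_MIN_AF:
        return "not_called"
    if allele_frequency < CONSENSUS_MIN_AF:
        return "called_low_frequency"
    return "high_confidence"
```

Under these assumptions, only sites labeled `high_confidence` would be applied to the reference when building the per-sample consensus, while `called_low_frequency` sites remain available for inspection as possible minority variants.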

| Phylogenetic analysis
Phylogenetic analysis was carried out on the obtained MPXV genomes in the context of a worldwide representative set of MPXV genomes available in NCBI Virus 40 and virological.org using the Augur application, 41 whose functionality relies on the IQ-Tree software. 42 The MAFFT program 43 was utilized for the multiple alignment, using the isolate MPXV-M5312_HM12_Rivers (NC_063383.1) as reference. A maximum likelihood method with a general time reversible model with unequal rates and unequal base frequencies 44 was used to reconstruct the viral phylogeny. The results can be viewed in the Nextstrain Auspice 45 local server, which is now part of the Genomic Surveillance Circuit of Andalusia. 46 The amino acid substitutions for each of the 160 samples sequenced in Andalusia (Supporting Information S2: Table 1) were obtained using the Nextclade web application. 47 Specifically, the pathogen reference used was "Human Monkeypox Clade B.1" (ON563414), which belongs to the same clade as the Andalusian samples but uses the coordinates of the isolate NC_063383.

| Structural variations
To identify samples that could harbor large structural variants or deletions, analyses of coverage plots were conducted. Specifically, samples exhibiting regions of unusually low or unusually high coverage in the coverage plots were located. 48 To confirm potential structural variants, reads from the samples displaying nonhomogeneous patterns in the coverage plots were aligned using the bwa program, 49 employing the "-a" argument to retain all alignments. The sequence OX044336.2 was used as the reference in this process.
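The coverage-based screening step can be approximated programmatically. The sketch below flags runs of positions whose depth is far below or far above the sample median; the thresholds (`low_factor`, `high_factor`, `min_length`) are arbitrary assumptions chosen for illustration, not values taken from this study, where coverage plots were inspected.

```python
from statistics import median


def flag_coverage_outliers(depths, low_factor=0.1, high_factor=1.8, min_length=3):
    """Return (start, end, kind) runs of abnormal depth, 1-based inclusive.

    kind is "low" (candidate deletion) or "high" (candidate duplication).
    All thresholds are hypothetical defaults for this example.
    """
    if not depths:
        return []
    med = median(depths)

    def label(depth):
        if depth < low_factor * med:
            return "low"
        if depth > high_factor * med:
            return "high"
        return None

    runs = []
    start, current = None, None
    for pos, depth in enumerate(depths, start=1):
        kind = label(depth)
        if kind != current:
            # Close the previous run if it was abnormal and long enough
            if current is not None and pos - start >= min_length:
                runs.append((start, pos - 1, current))
            start, current = pos, kind
    # Close a run that extends to the end of the genome
    if current is not None and len(depths) + 1 - start >= min_length:
        runs.append((start, len(depths), current))
    return runs


# Toy depth profile: a 5-position dropout and a 4-position excess
toy_depths = [100] * 10 + [0] * 5 + [100] * 10 + [250] * 4 + [100] * 10
toy_runs = flag_coverage_outliers(toy_depths)
```

Regions flagged this way are only candidates; as described in the text, they still need confirmation from read-level evidence.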
The breakpoints of deletions and the insertion points of rearrangements were determined by selecting reads that did not have an appropriate insert size. Such reads indicate that the mate pairs mapped to both sides of a deletion, or to different regions in the case of a duplication/rearrangement. To filter and obtain these reads, the samtools application 50 was used with the "-F14" filter, which eliminates unmapped reads, reads whose mate is unmapped, and properly paired reads. Furthermore, to confirm the start and end points, only reads with chimeric alignments, where a portion of the read aligned to one genomic region and another portion aligned to a different region, were retained. This selection was made using the flag "2048" to filter the reads.
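The two flag values mentioned above are bit masks defined by the SAM format. The following sketch decodes them; the constant and function names are chosen for this example, and the code only mimics the bitwise test that an exclusion filter such as `-F` performs rather than calling samtools itself.

```python
# Bit values defined by the SAM specification
PROPER_PAIR = 0x2      # 2: read mapped in a proper pair
UNMAPPED = 0x4         # 4: read itself is unmapped
MATE_UNMAPPED = 0x8    # 8: mate is unmapped
SUPPLEMENTARY = 0x800  # 2048: supplementary (chimeric) alignment

EXCLUDE_MASK = 14  # the "-F14" value from the text: 2 + 4 + 8


def passes_exclude_filter(flag, mask=EXCLUDE_MASK):
    """Keep a read only if none of the bits in mask are set in its flag."""
    return (flag & mask) == 0


def is_supplementary(flag):
    """True if the alignment record is marked as supplementary."""
    return bool(flag & SUPPLEMENTARY)
```

For instance, a flag of 99 (paired, proper pair, mate on the reverse strand, first in pair) is rejected by the mask, whereas a flag of 97 (the same read without the proper-pair bit) is kept as a discordant read.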
Finally, in IGV, 51 it was confirmed that the accumulation of reads with inappropriate insert sizes and chimeric reads corresponded to the positions where the coverage plot exhibited changes.

Federico Garcia: Resources.

F I G U R E 2
Detail of the phylogeny of the MPXV isolated in Andalusia (see Figure S1 for a more detailed picture).

F I G U R E 1
Phylogeny obtained with Nextstrain of the 2022 MPXV outbreak along with the rest of MPXV sequences available. MPXV, monkeypox.

F I G U R E 4
Schematic illustration of the MPXV genome (NC_063383.1) with the OPG057 protein coded by the gene A0A7H0DN30 in green, and a detail of the protein in the center. The red vertical line corresponds to the position of the OPG057:A290V mutation. MPXV, monkeypox.